AITopics

2605.15822

Country:

Europe (1.00)
North America > United States (0.92)

Genre: Research Report (0.81)

Industry:

Energy (0.45)
Transportation (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.60)

Neural Information Processing Systems, Apr-24-2026, 04:41:48 GMT

Supplementary Material for "Path following algorithms for ℓ2-regularized M-estimation with approximation guarantee"

Figure S2: Number of iterations at each grid point for the Newton and gradient descent methods applied to ℓ2-regularized logistic regression over simulated data generated in Example 2.

We summarize the results in Figures S1-S3. Figure S1 presents the results for ridge regression. In this case, the number of iterations taken by the gradient method first increases and then stays flat as t_k grows. The Newton method, however, takes only one iteration at each grid point. Moreover, the level of approximation (i.e., ϵ) seems to have no impact on the number of iterations at each grid point, which is highly desirable.
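The one-iteration behaviour of Newton's method on ridge regression follows from the objective being quadratic: a full Newton step from any starting point lands on the exact minimizer. The sketch below illustrates this on a single-feature ridge problem. It is not the path-following algorithm from the paper; the data, the penalty `lam`, the tolerance, and the fixed gradient-descent step of 0.5 / Hessian are all assumptions chosen for illustration.

```python
def ridge_gradient(beta, xs, ys, lam):
    """Gradient of 0.5 * sum((x*beta - y)^2) + 0.5 * lam * beta^2 (one feature)."""
    return sum(x * (x * beta - y) for x, y in zip(xs, ys)) + lam * beta


def ridge_hessian(xs, lam):
    """Second derivative of the same objective; it does not depend on beta."""
    return sum(x * x for x in xs) + lam


def newton_solve(xs, ys, lam, beta=0.0, tol=1e-8, max_iter=100):
    """Return (beta, iterations) using full Newton steps."""
    hess = ridge_hessian(xs, lam)
    for k in range(max_iter):
        grad = ridge_gradient(beta, xs, ys, lam)
        if abs(grad) <= tol:
            return beta, k
        beta -= grad / hess
    return beta, max_iter


def gradient_descent_solve(xs, ys, lam, beta=0.0, tol=1e-8, max_iter=100000):
    """Return (beta, iterations) with a fixed step of 0.5 / Hessian (assumed choice)."""
    step = 0.5 / ridge_hessian(xs, lam)
    for k in range(max_iter):
        grad = ridge_gradient(beta, xs, ys, lam)
        if abs(grad) <= tol:
            return beta, k
        beta -= step * grad
    return beta, max_iter


# Hypothetical toy data, not taken from the paper.
xs, ys, lam = [1.0, 2.0, 3.0], [2.0, 4.1, 5.9], 1.0
beta_newton, newton_iters = newton_solve(xs, ys, lam)
beta_gd, gd_iters = gradient_descent_solve(xs, ys, lam)
print(newton_iters, gd_iters)
```

Both solvers reach the same minimizer, but Newton's method needs a single step while gradient descent with this step size halves the gradient each iteration and so needs many more.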

artificial intelligence, machine learning, tmax, (16 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)

Neural Information Processing Systems, Feb-10-2026, 12:01:36 GMT

Score-Based Diffusion meets Annealed Importance Sampling

More than twenty years after its introduction, Annealed Importance Sampling (AIS) remains one of the most effective methods for marginal likelihood estimation.

artificial intelligence, machine learning, tk 1, (15 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Neural Information Processing Systems, Feb-9-2026, 11:18:11 GMT

64e52d01d26ad3914e556eeefb29a8ac-Paper-Conference.pdf

In machine teaching, a concept is represented by (and inferred from) a small number of labeled examples.

artificial intelligence, machine learning, teaching, (16 more...)

Country: Europe > Netherlands > North Holland > Amsterdam (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing Systems, Feb-9-2026, 02:31:31 GMT

1df5c96e327ea0cbd32e0d8bae835994-Paper-Conference.pdf

We consider learning a sparse model from linear measurements taken by a network of agents.

artificial intelligence, machine learning, vt 2, (20 more...)

Genre: Research Report (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Neural Information Processing Systems, Feb-8-2026, 05:35:26 GMT

Differentially-Private Federated Linear Bandits

Next, we prove rigorous bounds on the cumulative group pseudoregret obtained by FEDUCB.

artificial intelligence, big data, data mining, (16 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.47)

Neural Information Processing Systems, Mar-13-2024, 01:13:37 GMT

Linear Multi-Resource Allocation with Semi-Bandit Feedback

We study an idealised sequential resource allocation problem. In each time step the learner chooses an allocation of several resource types between a number of tasks. Assigning more resources to a task increases the probability that it is completed. The problem is challenging because the alignment of the tasks to the resource types is unknown and the feedback is noisy. Our main contribution is the new setting and an algorithm with nearly-optimal regret analysis. Along the way we draw connections to the problem of minimising regret for stochastic linear bandits with heteroscedastic noise. We also present some new results for stochastic linear bandits on the hypercube that significantly improve on existing work, especially in the sparse case.

algorithm, allocation, bandit, (15 more...)

Country:

North America > Canada > Alberta (0.14)
Asia > Middle East > Israel (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.94)

Westenbroek, Tyler, Mazumdar, Eric, Fridovich-Keil, David, Prabhu, Valmik, Tomlin, Claire J., Sastry, S. Shankar

Technical Report: Adaptive Control for Linearizable Systems Using On-Policy Reinforcement Learning

arXiv.org Machine Learning, Apr-6-2020

This paper proposes a framework for adaptively learning a feedback linearization-based tracking controller for an unknown system using discrete-time model-free policy-gradient parameter update rules. The primary advantage of the scheme over standard model-reference adaptive control techniques is that it does not require the learned inverse model to be invertible at all instances of time. This enables the use of general function approximators to approximate the linearizing controller for the system without having to worry about singularities. However, the discrete-time and stochastic nature of these algorithms precludes the direct application of standard machinery from the adaptive control literature to provide deterministic stability proofs for the system. Nevertheless, we leverage these techniques alongside tools from the stochastic approximation literature to demonstrate that with high probability the tracking and parameter errors concentrate near zero when a certain persistence of excitation condition is satisfied. A simulated example of a double pendulum demonstrates the utility of the proposed theory.

I. INTRODUCTION

Many real-world control systems display nonlinear behaviors which are difficult to model, necessitating the use of control architectures which can adapt to the unknown dynamics online while maintaining certificates of stability. There are many successful model-based strategies for adaptively constructing controllers for uncertain systems [1], [2], [3], but these methods often require the presence of a simple, reasonably accurate parametric model of the system dynamics. Recently, however, there has been a resurgence of interest in the use of model-free reinforcement learning techniques to construct feedback controllers without the need for a reliable dynamics model [4], [5], [6]. As these methods begin to be deployed in real world settings, a new theory is needed to understand the behavior of these algorithms as they are integrated into safety-critical control loops.

controller, tk 1, update rule, (17 more...)

2004.02766

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Control Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

arXiv.org Machine Learning, Jun-18-2019

Stochastic Runge-Kutta Accelerates Langevin Monte Carlo and Beyond

Li, Xuechen, Wu, Denny, Mackey, Lester, Erdogdu, Murat A.

Sampling with Markov chain Monte Carlo methods typically amounts to discretizing some continuous-time dynamics with numerical integration. In this paper, we establish the convergence rate of sampling algorithms obtained by discretizing smooth Itô diffusions exhibiting fast Wasserstein-$2$ contraction, based on local deviation properties of the integration scheme. In particular, we study a sampling algorithm constructed by discretizing the overdamped Langevin diffusion with the method of stochastic Runge-Kutta. For strongly convex potentials that are smooth up to a certain order, its iterates converge to the target distribution in $2$-Wasserstein distance in $\tilde{\mathcal{O}}(d\epsilon^{-2/3})$ iterations. This improves upon the best-known rate for strongly log-concave sampling based on the overdamped Langevin equation using only the gradient oracle without adjustment. In addition, we extend our analysis of stochastic Runge-Kutta methods to uniformly dissipative diffusions with possibly non-convex potentials and show they achieve better rates compared to the Euler-Maruyama scheme in terms of the dependence on tolerance $\epsilon$. Numerical studies show that these algorithms lead to better stability and lower asymptotic errors.
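For orientation, the Euler-Maruyama baseline mentioned in the abstract can be sketched in a few lines. This is the plain unadjusted Langevin update x ← x − h∇f(x) + √(2h)·ξ, not the stochastic Runge-Kutta scheme the paper studies; the function name `ula_samples`, the one-dimensional standard Gaussian target (potential f(x) = x²/2), the step size, and the run length are all assumptions made for illustration.

```python
import math
import random


def ula_samples(grad_potential, x0, step, n_steps, burn_in, seed=0):
    """Euler-Maruyama discretization of the overdamped Langevin diffusion
    dX = -grad f(X) dt + sqrt(2) dB, in one dimension."""
    rng = random.Random(seed)  # local generator so runs are reproducible
    x, samples = x0, []
    for k in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x - step * grad_potential(x) + math.sqrt(2.0 * step) * noise
        if k >= burn_in:  # discard the initial transient
            samples.append(x)
    return samples


# Assumed target: standard Gaussian, so grad f(x) = x.
samples = ula_samples(lambda x: x, x0=3.0, step=0.1, n_steps=25000, burn_in=5000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(round(mean, 2), round(var, 2))
```

For this target the discretized chain is stationary with variance 1 / (1 − h/2) rather than 1, which is a small concrete instance of the step-size-dependent bias that higher-order integration schemes aim to reduce.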

artificial intelligence, diffusion, machine learning, (16 more...)

1906.07868

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)

Gao, Xuefeng, Gurbuzbalaban, Mert, Zhu, Lingjiong

Breaking Reversibility Accelerates Langevin Dynamics for Global Non-Convex Optimization

arXiv.org Machine Learning, Dec-26-2018

Langevin dynamics (LD) has been proven to be a powerful technique for optimizing a non-convex objective as an efficient algorithm to find local minima while eventually visiting a global minimum on longer time-scales. LD is based on the first-order Langevin diffusion which is reversible in time. We study two variants that are based on non-reversible Langevin diffusions: the underdamped Langevin dynamics (ULD) and the Langevin dynamics with a non-symmetric drift (NLD). Adapting the techniques of Tzen, Liang and Raginsky (2018) for LD to non-reversible diffusions, we show that for a given local minimum that is within an arbitrary distance from the initialization, with high probability, either the ULD trajectory ends up somewhere outside a small neighborhood of this local minimum within a recurrence time which depends on the smallest eigenvalue of the Hessian at the local minimum, or it enters this neighborhood by the recurrence time and stays there for a potentially exponentially long escape time. The ULD algorithm improves upon the recurrence time obtained for LD in Tzen, Liang and Raginsky (2018) with respect to the dependency on the smallest eigenvalue of the Hessian at the local minimum. A similar result and improvement are obtained for the NLD algorithm. We also show that non-reversible variants can exit the basin of attraction of a local minimum faster in discrete time when the objective has two local minima separated by a saddle point, and we quantify the amount of improvement. Our analysis suggests that non-reversible Langevin algorithms are more efficient at locating a local minimum as well as at exploring the state space. Our analysis is based on the quadratic approximation of the objective around a local minimum. As a by-product of our analysis, we obtain optimal mixing rates for quadratic objectives in the 2-Wasserstein distance for the two non-reversible Langevin algorithms we consider.

diffusion, eigenvalue, langevin dynamic, (17 more...)

1812.07725

Country:

Asia > Middle East > Jordan (0.04)
Asia > China > Hong Kong (0.04)
North America > United States > New York > Nassau County > Mineola (0.04)
(6 more...)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)